Dataset statistics
| Number of variables | 18 |
|---|---|
| Number of observations | 2050638 |
| Missing cells | 2661654 |
| Missing cells (%) | 7.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 281.6 MiB |
| Average record size in memory | 144.0 B |
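The derived figures in this table can be recomputed from the raw counts. The sketch below is a minimal check, not the report's own code. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024² bytes; the report states neither.

```python
# Raw figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 2_050_638
missing_cells = 2_661_654
total_size_mib = 281.6

# Assumption: the percentage is taken over every cell in the frame.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under those assumptions both values agree with the table after rounding to one decimal place.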
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 9 |
| Unsupported | 1 |
Alerts
| Alert | Type |
|---|---|
| currency id has constant value "0.0" | Constant |
| country name has a high cardinality: 98 distinct values | High cardinality |
| locality name has a high cardinality: 617 distinct values | High cardinality |
| market name has a high cardinality: 3235 distinct values | High cardinality |
| commodity purchased has a high cardinality: 838 distinct values | High cardinality |
| name of currency has a high cardinality: 84 distinct values | High cardinality |
| unit of goods measurement has a high cardinality: 125 distinct values | High cardinality |
| country id is highly correlated with locality id | High correlation |
| locality id is highly correlated with country id | High correlation |
| market type id is highly correlated with market name.1 and 1 other fields | High correlation |
| market name.1 is highly correlated with market type id and 1 other fields | High correlation |
| country name is highly correlated with currency id and 1 other fields | High correlation |
| currency id is highly correlated with market type id and 3 other fields | High correlation |
| name of currency is highly correlated with country name and 1 other fields | High correlation |
| country id is highly correlated with country name and 1 other fields | High correlation |
| country name is highly correlated with country id and 8 other fields | High correlation |
| locality id is highly correlated with country name and 1 other fields | High correlation |
| market id is highly correlated with country name and 3 other fields | High correlation |
| commodity purchase id is highly correlated with country name and 2 other fields | High correlation |
| name of currency is highly correlated with country id and 8 other fields | High correlation |
| market type id is highly correlated with country name and 2 other fields | High correlation |
| market name.1 is highly correlated with country name and 2 other fields | High correlation |
| measurement id is highly correlated with country name and 1 other fields | High correlation |
| year recorded is highly correlated with country name and 2 other fields | High correlation |
| locality name has 611016 (29.8%) missing values | Missing |
| mp_commoditysource has 2050638 (100.0%) missing values | Missing |
| price paid is highly skewed (γ1 = 107.5510841) | Skewed |
| mp_commoditysource is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
| locality id has 26051 (1.3%) zeros | Zeros |
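The Skewed alert reports γ1 = 107.5510841 for price paid. As a rough illustration of what that statistic measures, here is a minimal sketch of the moment-based skewness g1 = m3 / m2^1.5. The function name `moment_skewness` and both sample lists are hypothetical, and the report may use a bias-corrected estimator, so the numbers are illustrative only.

```python
def moment_skewness(values):
    """Population skewness g1 = m3 / m2**1.5 from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data: a symmetric sample and one with a single extreme value.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 500]

print(moment_skewness(symmetric))
print(moment_skewness(right_tailed))
```

A symmetric sample gives zero, while a single very large value pushes the statistic positive. A value above 100, as reported for price paid, points to a few extreme prices dominating the distribution.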
Reproduction
| Analysis started | 2022-04-12 14:43:57.053014 |
|---|---|
| Analysis finished | 2022-04-12 14:45:22.593036 |
| Duration | 1 minute and 25.54 seconds |
| Software version | pandas-profiling v3.1.1 |
| Download configuration | config.json |
country id
| Distinct | 98 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1004.063627 |
| Minimum | 1 |
|---|---|
| Maximum | 70001 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 33 |
| Q1 | 105 |
| median | 150 |
| Q3 | 205 |
| 95-th percentile | 270 |
| Maximum | 70001 |
| Range | 70000 |
| Interquartile range (IQR) | 100 |
Descriptive statistics
| Standard deviation | 7163.518858 |
|---|---|
| Coefficient of variation (CV) | 7.134526806 |
| Kurtosis | 77.12139446 |
| Mean | 1004.063627 |
| Median Absolute Deviation (MAD) | 55 |
| Skewness | 8.747616706 |
| Sum | 2058971027 |
| Variance | 51316002.44 |
| Monotonicity | Not monotonic |
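Several of the figures above are tied together by simple identities (range, IQR, coefficient of variation, variance, sum). The following sketch recomputes them from the reported values as a consistency check. The variable names are illustrative, and the observation count is taken from the dataset statistics since this variable has no missing values.

```python
# Figures copied from the quantile and descriptive statistics tables.
n = 2_050_638
mean = 1004.063627
std = 7163.518858
minimum, maximum = 1, 70001
q1, q3 = 105, 205

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum is mean times the number of values

print(value_range, iqr)
print(f"{cv:.6f}", f"{variance:.2f}", f"{total:.0f}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.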
| Value | Count | Frequency (%) |
| 205 | 137746 | 6.7% |
| 115 | 137093 | 6.7% |
| 238 | 116588 | 5.7% |
| 196 | 82099 | 4.0% |
| 155 | 73843 | 3.6% |
| 116 | 72437 | 3.5% |
| 138 | 61188 | 3.0% |
| 43 | 60921 | 3.0% |
| 90 | 56971 | 2.8% |
| 181 | 54974 | 2.7% |
| Other values (88) | 1196778 | 58.4% |
| Value | Count | Frequency (%) |
| 1 | 15427 | 0.8% |
| 4 | 1793 | 0.1% |
| 8 | 1272 | 0.1% |
| 12 | 990 | < 0.1% |
| 13 | 20600 | 1.0% |
| 19 | 125 | < 0.1% |
| 23 | 7758 | 0.4% |
| 26 | 444 | < 0.1% |
| 29 | 39530 | 1.9% |
| 31 | 346 | < 0.1% |
| Value | Count | Frequency (%) |
| 70001 | 17746 | 0.9% |
| 40765 | 2304 | 0.1% |
| 40764 | 9890 | 0.5% |
| 999 | 22904 | 1.1% |
| 271 | 10957 | 0.5% |
| 270 | 42793 | 2.1% |
| 269 | 36806 | 1.8% |
| 264 | 275 | < 0.1% |
| 263 | 6 | < 0.1% |
| 257 | 46053 | 2.2% |
country name
| Distinct | 98 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| Rwanda | 137746 |
|---|---|
| Bassas da India | 137093 |
| Syrian Arab Republic | 116588 |
| Philippines | 82099 |
| Mali | 73843 |
| Other values (93) |
Length
| Max length | 32 |
|---|---|
| Median length | 7 |
| Mean length | 10.15997168 |
| Min length | 4 |
Characters and Unicode
| Total characters | 20834424 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
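The total character count should equal the mean string length multiplied by the number of non-missing values. A short sketch of that check, using the figures reported for this variable (it has no missing values, so the full observation count applies):

```python
n_rows = 2_050_638         # observations; this variable reports 0 missing
mean_length = 10.15997168  # "Mean length" from the Length table
total_chars = 20_834_424   # "Total characters" from the table above

# Mean length times row count should reproduce the character total.
estimated = mean_length * n_rows
print(round(estimated), total_chars)
```

The same identity can be used on the other text variables, provided missing rows are excluded from the count.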
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Afghanistan |
|---|---|
| 2nd row | Afghanistan |
| 3rd row | Afghanistan |
| 4th row | Afghanistan |
| 5th row | Afghanistan |
Common Values
| Value | Count | Frequency (%) |
| Rwanda | 137746 | 6.7% |
| Bassas da India | 137093 | 6.7% |
| Syrian Arab Republic | 116588 | 5.7% |
| Philippines | 82099 | 4.0% |
| Mali | 73843 | 3.6% |
| Indonesia | 72437 | 3.5% |
| Kyrgyzstan | 61188 | 3.0% |
| Burundi | 60921 | 3.0% |
| Gambia | 56971 | 2.8% |
| Niger | 54974 | 2.7% |
| Other values (88) | 1196778 | 58.4% |
Words
| Value | Count | Frequency (%) |
| republic | 255964 | 8.1% |
| rwanda | 137746 | 4.4% |
| india | 137093 | 4.4% |
| bassas | 137093 | 4.4% |
| da | 137093 | 4.4% |
| of | 117266 | 3.7% |
| syrian | 116588 | 3.7% |
| arab | 116588 | 3.7% |
| philippines | 82099 | 2.6% |
| democratic | 76954 | 2.5% |
| Other values (108) | 1826222 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 2996884 | 14.4% |
| i | 2108754 | 10.1% |
| n | 1607785 | 7.7% |
| e | 1352481 | 6.5% |
| (space) | 1090548 | 5.2% |
| o | 871453 | 4.2% |
| r | 862256 | 4.1% |
| s | 844326 | 4.1% |
| d | 738654 | 3.5% |
| b | 695010 | 3.3% |
| Other values (43) | 7666273 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 16822475 | |
| Uppercase Letter | 2860973 | 13.7% |
| Space Separator | 1090548 | 5.2% |
| Other Punctuation | 37790 | 0.2% |
| Dash Punctuation | 21678 | 0.1% |
| Open Punctuation | 480 | < 0.1% |
| Close Punctuation | 480 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2996884 | |
| i | 2108754 | |
| n | 1607785 | 9.6% |
| e | 1352481 | 8.0% |
| o | 871453 | 5.2% |
| r | 862256 | 5.1% |
| s | 844326 | 5.0% |
| d | 738654 | 4.4% |
| b | 695010 | 4.1% |
| l | 654687 | 3.9% |
| Other values (16) | 4090185 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 395087 | |
| B | 318749 | |
| S | 266269 | |
| I | 243956 | 8.5% |
| M | 193957 | 6.8% |
| A | 170289 | 6.0% |
| C | 166444 | 5.8% |
| L | 154155 | 5.4% |
| P | 150996 | 5.3% |
| N | 139378 | 4.9% |
| Other values (12) | 661693 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1090548 |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 37790 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 21678 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 480 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 480 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 19683448 | |
| Common | 1150976 | 5.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 2996884 | |
| i | 2108754 | 10.7% |
| n | 1607785 | 8.2% |
| e | 1352481 | 6.9% |
| o | 871453 | 4.4% |
| r | 862256 | 4.4% |
| s | 844326 | 4.3% |
| d | 738654 | 3.8% |
| b | 695010 | 3.5% |
| l | 654687 | 3.3% |
| Other values (38) | 6951158 |
Common
| Value | Count | Frequency (%) |
| (space) | 1090548 | |
| ' | 37790 | 3.3% |
| - | 21678 | 1.9% |
| ( | 480 | < 0.1% |
| ) | 480 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20834424 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 2996884 | 14.4% |
| i | 2108754 | 10.1% |
| n | 1607785 | 7.7% |
| e | 1352481 | 6.5% |
| (space) | 1090548 | 5.2% |
| o | 871453 | 4.2% |
| r | 862256 | 4.1% |
| s | 844326 | 4.1% |
| d | 738654 | 3.5% |
| b | 695010 | 3.3% |
| Other values (43) | 7666273 |
locality id
| Distinct | 894 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26310.71243 |
| Minimum | 0 |
|---|---|
| Maximum | 900022 |
| Zeros | 26051 |
| Zeros (%) | 1.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 618 |
| Q1 | 1510 |
| median | 2156 |
| Q3 | 3433 |
| 95-th percentile | 67161 |
| Maximum | 900022 |
| Range | 900022 |
| Interquartile range (IQR) | 1923 |
Descriptive statistics
| Standard deviation | 115952.8831 |
|---|---|
| Coefficient of variation (CV) | 4.407059802 |
| Kurtosis | 51.08067286 |
| Mean | 26310.71243 |
| Median Absolute Deviation (MAD) | 694 |
| Skewness | 7.176866525 |
| Sum | 5.395374672 × 10¹⁰ |
| Variance | 1.34450711 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21971 | 34770 | 1.7% |
| 21972 | 31610 | 1.5% |
| 21969 | 30805 | 1.5% |
| 21973 | 30032 | 1.5% |
| 0 | 26051 | 1.3% |
| 2240 | 15671 | 0.8% |
| 2216 | 14512 | 0.7% |
| 2842 | 14051 | 0.7% |
| 1285 | 13535 | 0.7% |
| 2834 | 13476 | 0.7% |
| Other values (884) | 1826125 |
| Value | Count | Frequency (%) |
| 0 | 26051 | |
| 272 | 1888 | 0.1% |
| 273 | 210 | < 0.1% |
| 274 | 210 | < 0.1% |
| 275 | 210 | < 0.1% |
| 276 | 240 | < 0.1% |
| 277 | 210 | < 0.1% |
| 278 | 911 | < 0.1% |
| 279 | 210 | < 0.1% |
| 280 | 286 | < 0.1% |
| Value | Count | Frequency (%) |
| 900022 | 2026 | |
| 900019 | 2018 | |
| 900018 | 2023 | |
| 900017 | 2022 | |
| 900016 | 2023 | |
| 900015 | 2021 | |
| 900014 | 1935 | |
| 900012 | 2016 | |
| 900011 | 1531 | |
| 900009 | 1982 |
locality name
| Distinct | 617 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 611016 |
| Missing (%) | 29.8% |
| Memory size | 15.6 MiB |
| North/Amajyaruguru | 34770 |
|---|---|
| South/Amajyepfo | 31610 |
| East/Iburasirazuba | 30805 |
| West/Iburengerazuba | 30032 |
| Yobe | 15671 |
| Other values (612) |
Length
| Max length | 43 |
|---|---|
| Median length | 8 |
| Mean length | 10.20118337 |
| Min length | 3 |
Characters and Unicode
| Total characters | 14685848 |
|---|---|
| Distinct characters | 62 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Badakhshan |
|---|---|
| 2nd row | Badakhshan |
| 3rd row | Badakhshan |
| 4th row | Badakhshan |
| 5th row | Badakhshan |
Common Values
| Value | Count | Frequency (%) |
| North/Amajyaruguru | 34770 | 1.7% |
| South/Amajyepfo | 31610 | 1.5% |
| East/Iburasirazuba | 30805 | 1.5% |
| West/Iburengerazuba | 30032 | 1.5% |
| Yobe | 15671 | 0.8% |
| Borno | 14512 | 0.7% |
| Homs | 14051 | 0.7% |
| Northern | 13971 | 0.7% |
| Central River | 13535 | 0.7% |
| Aleppo | 13476 | 0.7% |
| Other values (607) | 1227189 | |
| (Missing) | 611016 |
Words
| Value | Count | Frequency (%) |
| region | 121334 | 6.1% |
| central | 49070 | 2.5% |
| north/amajyaruguru | 34770 | 1.7% |
| south/amajyepfo | 31610 | 1.6% |
| east/iburasirazuba | 30805 | 1.5% |
| west/iburengerazuba | 30032 | 1.5% |
| river | 26883 | 1.3% |
| western | 23963 | 1.2% |
| northern | 23793 | 1.2% |
| southern | 22566 | 1.1% |
| Other values (683) | 1604639 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1838743 | 12.5% |
| o | 970129 | 6.6% |
| e | 947552 | 6.5% |
| r | 935903 | 6.4% |
| i | 851988 | 5.8% |
| n | 829999 | 5.7% |
| u | 744450 | 5.1% |
| t | 659266 | 4.5% |
| (space) | 559843 | 3.8% |
| s | 486276 | 3.3% |
| Other values (52) | 5861699 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11416522 | |
| Uppercase Letter | 2277193 | 15.5% |
| Space Separator | 559843 | 3.8% |
| Other Punctuation | 149631 | 1.0% |
| Close Punctuation | 94836 | 0.6% |
| Open Punctuation | 94836 | 0.6% |
| Dash Punctuation | 60551 | 0.4% |
| Connector Punctuation | 32436 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1838743 | |
| o | 970129 | 8.5% |
| e | 947552 | 8.3% |
| r | 935903 | 8.2% |
| i | 851988 | 7.5% |
| n | 829999 | 7.3% |
| u | 744450 | 6.5% |
| t | 659266 | 5.8% |
| s | 486276 | 4.3% |
| l | 416167 | 3.6% |
| Other values (18) | 2736049 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 219573 | 9.6% |
| C | 189159 | 8.3% |
| I | 179307 | 7.9% |
| R | 178809 | 7.9% |
| S | 170411 | 7.5% |
| N | 156315 | 6.9% |
| K | 154031 | 6.8% |
| M | 141234 | 6.2% |
| B | 128562 | 5.6% |
| D | 101222 | 4.4% |
| Other values (16) | 658570 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 137746 | |
| ' | 9908 | 6.6% |
| . | 1977 | 1.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 559843 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 94836 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 94836 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 60551 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 32436 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13693715 | |
| Common | 992133 | 6.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1838743 | 13.4% |
| o | 970129 | 7.1% |
| e | 947552 | 6.9% |
| r | 935903 | 6.8% |
| i | 851988 | 6.2% |
| n | 829999 | 6.1% |
| u | 744450 | 5.4% |
| t | 659266 | 4.8% |
| s | 486276 | 3.6% |
| l | 416167 | 3.0% |
| Other values (44) | 5013242 |
Common
| Value | Count | Frequency (%) |
| (space) | 559843 | |
| / | 137746 | 13.9% |
| ) | 94836 | 9.6% |
| ( | 94836 | 9.6% |
| - | 60551 | 6.1% |
| _ | 32436 | 3.3% |
| ' | 9908 | 1.0% |
| . | 1977 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14678368 | |
| None | 7480 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1838743 | 12.5% |
| o | 970129 | 6.6% |
| e | 947552 | 6.5% |
| r | 935903 | 6.4% |
| i | 851988 | 5.8% |
| n | 829999 | 5.7% |
| u | 744450 | 5.1% |
| t | 659266 | 4.5% |
| (space) | 559843 | 3.8% |
| s | 486276 | 3.3% |
| Other values (50) | 5854219 |
None
| Value | Count | Frequency (%) |
| é | 6604 | |
| ï | 876 | 11.7% |
market id
| Distinct | 3266 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1591.206603 |
| Minimum | 80 |
|---|---|
| Maximum | 6083 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 80 |
|---|---|
| 5-th percentile | 180 |
| Q1 | 644 |
| median | 1441 |
| Q3 | 2331 |
| 95-th percentile | 4298 |
| Maximum | 6083 |
| Range | 6003 |
| Interquartile range (IQR) | 1687 |
Descriptive statistics
| Standard deviation | 1181.314129 |
|---|---|
| Coefficient of variation (CV) | 0.7424014749 |
| Kurtosis | 0.5968683393 |
| Mean | 1591.206603 |
| Median Absolute Deviation (MAD) | 838 |
| Skewness | 0.9625167145 |
| Sum | 3262988726 |
| Variance | 1395503.071 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 840 | 5798 | 0.3% |
| 305 | 5270 | 0.3% |
| 302 | 5270 | 0.3% |
| 303 | 5241 | 0.3% |
| 306 | 5224 | 0.3% |
| 304 | 4596 | 0.2% |
| 671 | 4584 | 0.2% |
| 842 | 4583 | 0.2% |
| 672 | 4568 | 0.2% |
| 680 | 4345 | 0.2% |
| Other values (3256) | 2001159 |
| Value | Count | Frequency (%) |
| 80 | 603 | |
| 81 | 579 | |
| 82 | 550 | |
| 83 | 511 | |
| 84 | 589 | |
| 85 | 552 | |
| 86 | 537 | |
| 87 | 437 | |
| 88 | 525 | |
| 89 | 587 |
| Value | Count | Frequency (%) |
| 6083 | 21 | < 0.1% |
| 6082 | 21 | < 0.1% |
| 6081 | 21 | < 0.1% |
| 6080 | 42 | |
| 6079 | 7 | < 0.1% |
| 6078 | 7 | < 0.1% |
| 6077 | 7 | < 0.1% |
| 6011 | 57 | |
| 6010 | 60 | |
| 6009 | 63 |
market name
| Distinct | 3235 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| National Average | 19748 |
|---|---|
| Bogota | 5798 |
| Dushanbe | 5270 |
| Khujand | 5270 |
| Gharm | 5241 |
| Other values (3230) |
Length
| Max length | 42 |
|---|---|
| Median length | 8 |
| Mean length | 8.593822996 |
| Min length | 2 |
Characters and Unicode
| Total characters | 17622820 |
|---|---|
| Distinct characters | 75 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Fayzabad |
|---|---|
| 2nd row | Fayzabad |
| 3rd row | Fayzabad |
| 4th row | Fayzabad |
| 5th row | Fayzabad |
Common Values
| Value | Count | Frequency (%) |
| National Average | 19748 | 1.0% |
| Bogota | 5798 | 0.3% |
| Dushanbe | 5270 | 0.3% |
| Khujand | 5270 | 0.3% |
| Gharm | 5241 | 0.3% |
| Bokhtar | 5224 | 0.3% |
| Khorog | 4596 | 0.2% |
| Bishkek | 4584 | 0.2% |
| Medellin | 4583 | 0.2% |
| Osh | 4568 | 0.2% |
| Other values (3225) | 1985756 |
Words
| Value | Count | Frequency (%) |
| pasar | 70774 | 2.7% |
| city | 60163 | 2.3% |
| region | 44211 | 1.7% |
| al | 30422 | 1.1% |
| average | 21195 | 0.8% |
| national | 21195 | 0.8% |
| town | 13600 | 0.5% |
| commune | 11141 | 0.4% |
| kota | 10818 | 0.4% |
| santa | 9076 | 0.3% |
| Other values (3513) | 2353069 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 2858930 | |
| i | 1123262 | 6.4% |
| o | 1081994 | 6.1% |
| n | 1065451 | 6.0% |
| e | 956717 | 5.4% |
| r | 923184 | 5.2% |
| u | 828083 | 4.7% |
| (space) | 595026 | 3.4% |
| l | 565982 | 3.2% |
| t | 535595 | 3.0% |
| Other values (65) | 7088596 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 14094635 | |
| Uppercase Letter | 2693395 | 15.3% |
| Space Separator | 595026 | 3.4% |
| Dash Punctuation | 80950 | 0.5% |
| Close Punctuation | 57251 | 0.3% |
| Open Punctuation | 57251 | 0.3% |
| Other Punctuation | 38645 | 0.2% |
| Decimal Number | 5318 | < 0.1% |
| Connector Punctuation | 349 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2858930 | |
| i | 1123262 | 8.0% |
| o | 1081994 | 7.7% |
| n | 1065451 | 7.6% |
| e | 956717 | 6.8% |
| r | 923184 | 6.5% |
| u | 828083 | 5.9% |
| l | 565982 | 4.0% |
| t | 535595 | 3.8% |
| g | 495695 | 3.5% |
| Other values (22) | 3659742 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 278089 | 10.3% |
| B | 275889 | 10.2% |
| K | 271571 | 10.1% |
| A | 207147 | 7.7% |
| S | 191038 | 7.1% |
| C | 183785 | 6.8% |
| P | 164790 | 6.1% |
| N | 143427 | 5.3% |
| T | 143101 | 5.3% |
| G | 128355 | 4.8% |
| Other values (17) | 706203 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1295 | |
| 5 | 866 | |
| 8 | 774 | |
| 0 | 690 | |
| 3 | 672 | |
| 4 | 487 | 9.2% |
| 1 | 485 | 9.1% |
| 6 | 49 | 0.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 23032 | |
| . | 10008 | |
| / | 5605 | 14.5% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 595026 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 80950 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 57251 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 57251 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 349 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 16788030 | |
| Common | 834790 | 4.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 2858930 | |
| i | 1123262 | 6.7% |
| o | 1081994 | 6.4% |
| n | 1065451 | 6.3% |
| e | 956717 | 5.7% |
| r | 923184 | 5.5% |
| u | 828083 | 4.9% |
| l | 565982 | 3.4% |
| t | 535595 | 3.2% |
| g | 495695 | 3.0% |
| Other values (49) | 6353137 |
Common
| Value | Count | Frequency (%) |
| (space) | 595026 | |
| - | 80950 | 9.7% |
| ) | 57251 | 6.9% |
| ( | 57251 | 6.9% |
| ' | 23032 | 2.8% |
| . | 10008 | 1.2% |
| / | 5605 | 0.7% |
| 2 | 1295 | 0.2% |
| 5 | 866 | 0.1% |
| 8 | 774 | 0.1% |
| Other values (6) | 2732 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 17559234 | |
| None | 63586 | 0.4% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 2858930 | |
| i | 1123262 | 6.4% |
| o | 1081994 | 6.2% |
| n | 1065451 | 6.1% |
| e | 956717 | 5.4% |
| r | 923184 | 5.3% |
| u | 828083 | 4.7% |
| (space) | 595026 | 3.4% |
| l | 565982 | 3.2% |
| t | 535595 | 3.1% |
| Other values (58) | 7025010 |
None
| Value | Count | Frequency (%) |
| é | 41441 | |
| è | 14669 | 23.1% |
| ó | 3910 | 6.1% |
| ï | 2075 | 3.3% |
| â | 837 | 1.3% |
| á | 347 | 0.5% |
| É | 307 | 0.5% |
commodity purchase id
| Distinct | 636 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 220.1166403 |
| Minimum | 50 |
|---|---|
| Maximum | 893 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 50 |
|---|---|
| 5-th percentile | 52 |
| Q1 | 73 |
| median | 141 |
| Q3 | 303 |
| 95-th percentile | 680 |
| Maximum | 893 |
| Range | 843 |
| Interquartile range (IQR) | 230 |
Descriptive statistics
| Standard deviation | 193.8962677 |
|---|---|
| Coefficient of variation (CV) | 0.8808796437 |
| Kurtosis | 1.976363931 |
| Mean | 220.1166403 |
| Median Absolute Deviation (MAD) | 77 |
| Skewness | 1.558552774 |
| Sum | 451379547 |
| Variance | 37595.76261 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 73 | 63025 | 3.1% |
| 64 | 59379 | 2.9% |
| 51 | 59311 | 2.9% |
| 67 | 53356 | 2.6% |
| 65 | 50167 | 2.4% |
| 52 | 48675 | 2.4% |
| 58 | 48564 | 2.4% |
| 97 | 46746 | 2.3% |
| 71 | 41692 | 2.0% |
| 114 | 32430 | 1.6% |
| Other values (626) | 1547293 |
| Value | Count | Frequency (%) |
| 50 | 18905 | 0.9% |
| 51 | 59311 | |
| 52 | 48675 | |
| 54 | 5856 | 0.3% |
| 55 | 13593 | 0.7% |
| 56 | 8002 | 0.4% |
| 57 | 1370 | 0.1% |
| 58 | 48564 | |
| 60 | 5752 | 0.3% |
| 61 | 8013 | 0.4% |
| Value | Count | Frequency (%) |
| 893 | 3113 | |
| 887 | 608 | < 0.1% |
| 886 | 52 | < 0.1% |
| 885 | 1037 | 0.1% |
| 884 | 592 | < 0.1% |
| 883 | 820 | < 0.1% |
| 882 | 927 | < 0.1% |
| 881 | 929 | < 0.1% |
| 880 | 934 | < 0.1% |
| 879 | 934 | < 0.1% |
commodity purchased
| Distinct | 838 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| Millet - Retail | 55898 |
|---|---|
| Rice (imported) - Retail | 53601 |
| Maize - Retail | 48596 |
| Sorghum - Retail | 46507 |
| Wheat flour - Retail | 46360 |
| Other values (833) |
Length
| Max length | 55 |
|---|---|
| Median length | 21 |
| Mean length | 21.92605618 |
| Min length | 12 |
Characters and Unicode
| Total characters | 44962404 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 6 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Bread - Retail |
|---|---|
| 2nd row | Bread - Retail |
| 3rd row | Bread - Retail |
| 4th row | Bread - Retail |
| 5th row | Bread - Retail |
Common Values
| Value | Count | Frequency (%) |
| Millet - Retail | 55898 | 2.7% |
| Rice (imported) - Retail | 53601 | 2.6% |
| Maize - Retail | 48596 | 2.4% |
| Sorghum - Retail | 46507 | 2.3% |
| Wheat flour - Retail | 46360 | 2.3% |
| Sugar - Retail | 46082 | 2.2% |
| Maize (white) - Retail | 41717 | 2.0% |
| Rice - Retail | 40290 | 2.0% |
| Rice (local) - Retail | 37815 | 1.8% |
| Tomatoes - Retail | 31364 | 1.5% |
| Other values (828) | 1602408 |
Words
| Value | Count | Frequency (%) |
| (blank) | 2050638 | |
| retail | 1878421 | |
| rice | 250669 | 3.2% |
| maize | 175816 | 2.2% |
| wholesale | 171305 | 2.2% |
| white | 120355 | 1.5% |
| meat | 118460 | 1.5% |
| oil | 117255 | 1.5% |
| beans | 109130 | 1.4% |
| flour | 104270 | 1.3% |
| Other values (479) | 2840906 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5887517 | |
| e | 4926422 | 11.0% |
| a | 4098395 | 9.1% |
| l | 3673779 | 8.2% |
| i | 3570863 | 7.9% |
| t | 3214451 | 7.1% |
| R | 2129741 | 4.7% |
| - | 2112496 | 4.7% |
| o | 1399825 | 3.1% |
| s | 1361732 | 3.0% |
| Other values (55) | 12587183 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 30219451 | |
| Space Separator | 5887517 | 13.1% |
| Uppercase Letter | 4141199 | 9.2% |
| Dash Punctuation | 2112496 | 4.7% |
| Open Punctuation | 1168761 | 2.6% |
| Close Punctuation | 1168761 | 2.6% |
| Other Punctuation | 257618 | 0.6% |
| Decimal Number | 6591 | < 0.1% |
| Math Symbol | 10 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 4926422 | |
| a | 4098395 | |
| l | 3673779 | |
| i | 3570863 | |
| t | 3214451 | |
| o | 1399825 | 4.6% |
| s | 1361732 | 4.5% |
| r | 1140253 | 3.8% |
| n | 885881 | 2.9% |
| h | 835541 | 2.8% |
| Other values (16) | 5112309 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 2129741 | |
| M | 401078 | 9.7% |
| W | 293377 | 7.1% |
| S | 234389 | 5.7% |
| B | 184606 | 4.5% |
| O | 173190 | 4.2% |
| C | 160938 | 3.9% |
| F | 120663 | 2.9% |
| P | 110426 | 2.7% |
| T | 71282 | 1.7% |
| Other values (15) | 261509 | 6.3% |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 1307 | |
| 5 | 1307 | |
| 0 | 1182 | |
| 8 | 1024 | |
| 1 | 1015 | |
| 2 | 756 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 244443 | |
| ' | 7764 | 3.0% |
| / | 5411 | 2.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5887517 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2112496 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1168761 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1168761 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 10 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 34360650 | |
| Common | 10601754 | 23.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 4926422 | |
| a | 4098395 | |
| l | 3673779 | |
| i | 3570863 | |
| t | 3214451 | |
| R | 2129741 | 6.2% |
| o | 1399825 | 4.1% |
| s | 1361732 | 4.0% |
| r | 1140253 | 3.3% |
| n | 885881 | 2.6% |
| Other values (41) | 7959308 |
Common
| Value | Count | Frequency (%) |
| (space) | 5887517 | |
| - | 2112496 | 19.9% |
| ( | 1168761 | 11.0% |
| ) | 1168761 | 11.0% |
| , | 244443 | 2.3% |
| ' | 7764 | 0.1% |
| / | 5411 | 0.1% |
| 9 | 1307 | < 0.1% |
| 5 | 1307 | < 0.1% |
| 0 | 1182 | < 0.1% |
| Other values (4) | 2805 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 44962404 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5887517 | |
| e | 4926422 | 11.0% |
| a | 4098395 | 9.1% |
| l | 3673779 | 8.2% |
| i | 3570863 | 7.9% |
| t | 3214451 | 7.1% |
| R | 2129741 | 4.7% |
| - | 2112496 | 4.7% |
| o | 1399825 | 3.1% |
| s | 1361732 | 3.0% |
| Other values (55) | 12587183 |
currency id
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| 0.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 6151914 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 2050638 |
Words
| Value | Count | Frequency (%) |
| 0.0 | 2050638 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4101276 | |
| . | 2050638 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4101276 | |
| Other Punctuation | 2050638 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4101276 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2050638 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6151914 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4101276 | |
| . | 2050638 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6151914 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4101276 | |
| . | 2050638 |
name of currency
| Distinct | 84 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| XOF | |
|---|---|
| RWF | |
| INR | |
| SYP | 116588 |
| PHP | 82099 |
| Other values (79) |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 6151914 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | AFN |
|---|---|
| 2nd row | AFN |
| 3rd row | AFN |
| 4th row | AFN |
| 5th row | AFN |
Common Values
| Value | Count | Frequency (%) |
| XOF | 270646 | 13.2% |
| RWF | 137746 | 6.7% |
| INR | 137093 | 6.7% |
| SYP | 116588 | 5.7% |
| PHP | 82099 | 4.0% |
| IDR | 72437 | 3.5% |
| KGS | 61188 | 3.0% |
| BIF | 60921 | 3.0% |
| XAF | 59853 | 2.9% |
| GMD | 56971 | 2.8% |
| Other values (74) | 995096 |
Words
| Value | Count | Frequency (%) |
| xof | 270646 | 13.2% |
| rwf | 137746 | 6.7% |
| inr | 137093 | 6.7% |
| syp | 116588 | 5.7% |
| php | 82099 | 4.0% |
| idr | 72437 | 3.5% |
| kgs | 61188 | 3.0% |
| bif | 60921 | 3.0% |
| xaf | 59853 | 2.9% |
| gmd | 56971 | 2.8% |
| Other values (74) | 995096 |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 609467 | 9.9% |
| R | 476439 | 7.7% |
| S | 442070 | 7.2% |
| P | 407912 | 6.6% |
| O | 374804 | 6.1% |
| N | 373245 | 6.1% |
| D | 362213 | 5.9% |
| X | 342207 | 5.6% |
| I | 322785 | 5.2% |
| M | 267172 | 4.3% |
| Other values (16) | 2173600 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 6151914 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 609467 | 9.9% |
| R | 476439 | 7.7% |
| S | 442070 | 7.2% |
| P | 407912 | 6.6% |
| O | 374804 | 6.1% |
| N | 373245 | 6.1% |
| D | 362213 | 5.9% |
| X | 342207 | 5.6% |
| I | 322785 | 5.2% |
| M | 267172 | 4.3% |
| Other values (16) | 2173600 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6151914 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 609467 | 9.9% |
| R | 476439 | 7.7% |
| S | 442070 | 7.2% |
| P | 407912 | 6.6% |
| O | 374804 | 6.1% |
| N | 373245 | 6.1% |
| D | 362213 | 5.9% |
| X | 342207 | 5.6% |
| I | 322785 | 5.2% |
| M | 267172 | 4.3% |
| Other values (16) | 2173600 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6151914 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| F | 609467 | 9.9% |
| R | 476439 | 7.7% |
| S | 442070 | 7.2% |
| P | 407912 | 6.6% |
| O | 374804 | 6.1% |
| N | 373245 | 6.1% |
| D | 362213 | 5.9% |
| X | 342207 | 5.6% |
| I | 322785 | 5.2% |
| M | 267172 | 4.3% |
| Other values (16) | 2173600 |
market type id
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
| 15 | |
|---|---|
| 14 | 171305 |
| 18 | 664 |
| 17 | 248 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 4101276 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 15 |
|---|---|
| 2nd row | 15 |
| 3rd row | 15 |
| 4th row | 15 |
| 5th row | 15 |
Common Values
| Value | Count | Frequency (%) |
| 15 | 1878421 | 91.6% |
| 14 | 171305 | 8.4% |
| 18 | 664 | < 0.1% |
| 17 | 248 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 15 | 1878421 | 91.6% |
| 14 | 171305 | 8.4% |
| 18 | 664 | < 0.1% |
| 17 | 248 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2050638 | 50.0% |
| 5 | 1878421 | 45.8% |
| 4 | 171305 | 4.2% |
| 8 | 664 | < 0.1% |
| 7 | 248 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4101276 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4101276 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4101276 | 100.0% |
market name.1
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 6.251825529 |
| Min length | 6 |
Characters and Unicode
| Total characters | 12820231 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Retail |
|---|---|
| 2nd row | Retail |
| 3rd row | Retail |
| 4th row | Retail |
| 5th row | Retail |
Common Values
| Value | Count | Frequency (%) |
| Retail | 1878421 | 91.6% |
| Wholesale | 171305 | 8.4% |
| Farm Gate | 664 | < 0.1% |
| Producer | 248 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| retail | 1878421 | 91.6% |
| wholesale | 171305 | 8.4% |
| farm | 664 | < 0.1% |
| gate | 664 | < 0.1% |
| producer | 248 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2221943 | 17.3% |
| l | 2221031 | 17.3% |
| a | 2051054 | 16.0% |
| t | 1879085 | 14.7% |
| R | 1878421 | 14.7% |
| i | 1878421 | 14.7% |
| o | 171553 | 1.3% |
| s | 171305 | 1.3% |
| h | 171305 | 1.3% |
| W | 171305 | 1.3% |
| Other values (9) | 4808 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10768265 | 84.0% |
| Uppercase Letter | 2051302 | 16.0% |
| Space Separator | 664 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2221943 | 20.6% |
| l | 2221031 | 20.6% |
| a | 2051054 | 19.0% |
| t | 1879085 | 17.5% |
| i | 1878421 | 17.4% |
| o | 171553 | 1.6% |
| s | 171305 | 1.6% |
| h | 171305 | 1.6% |
| r | 1160 | < 0.1% |
| m | 664 | < 0.1% |
| Other values (3) | 744 | < 0.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 1878421 | 91.6% |
| W | 171305 | 8.4% |
| F | 664 | < 0.1% |
| G | 664 | < 0.1% |
| P | 248 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 664 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12819567 | > 99.9% |
| Common | 664 | < 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2221943 | 17.3% |
| l | 2221031 | 17.3% |
| a | 2051054 | 16.0% |
| t | 1879085 | 14.7% |
| R | 1878421 | 14.7% |
| i | 1878421 | 14.7% |
| o | 171553 | 1.3% |
| s | 171305 | 1.3% |
| h | 171305 | 1.3% |
| W | 171305 | 1.3% |
| Other values (8) | 4144 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 664 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12820231 | 100.0% |
measurement id
Real number (ℝ≥0)
| Distinct | 125 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.86923777 |
| Minimum | 5 |
| Maximum | 175 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 5 |
| median | 5 |
| Q3 | 9 |
| 95-th percentile | 69 |
| Maximum | 175 |
| Range | 170 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 25.98689567 |
|---|---|
| Coefficient of variation (CV) | 1.747695213 |
| Kurtosis | 13.42317863 |
| Mean | 14.86923777 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.53663878 |
| Sum | 30491424 |
| Variance | 675.3187466 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 1523770 | 74.3% |
| 15 | 138409 | 6.7% |
| 9 | 51612 | 2.5% |
| 33 | 38615 | 1.9% |
| 51 | 17214 | 0.8% |
| 61 | 14848 | 0.7% |
| 22 | 14036 | 0.7% |
| 17 | 13261 | 0.6% |
| 108 | 11793 | 0.6% |
| 30 | 11616 | 0.6% |
| Other values (115) | 215464 | 10.5% |
| Value | Count | Frequency (%) |
| 5 | 1523770 | 74.3% |
| 9 | 51612 | 2.5% |
| 14 | 7634 | 0.4% |
| 15 | 138409 | 6.7% |
| 16 | 6622 | 0.3% |
| 17 | 13261 | 0.6% |
| 18 | 5293 | 0.3% |
| 19 | 158 | < 0.1% |
| 20 | 1893 | 0.1% |
| 21 | 2252 | 0.1% |
| Value | Count | Frequency (%) |
| 175 | 60 | < 0.1% |
| 171 | 17 | < 0.1% |
| 170 | 158 | < 0.1% |
| 169 | 1011 | < 0.1% |
| 168 | 820 | < 0.1% |
| 167 | 927 | < 0.1% |
| 166 | 929 | < 0.1% |
| 164 | 804 | < 0.1% |
| 163 | 820 | < 0.1% |
| 161 | 2630 | 0.1% |
unit of goods measurement
Categorical
| Distinct | 125 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 15.6 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 2 |
| Mean length | 2.535615745 |
| Min length | 1 |
Characters and Unicode
| Total characters | 5199630 |
|---|---|
| Distinct characters | 47 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | KG |
|---|---|
| 2nd row | KG |
| 3rd row | KG |
| 4th row | KG |
| 5th row | KG |
Common Values
| Value | Count | Frequency (%) |
| KG | 1523770 | 74.3% |
| L | 138409 | 6.7% |
| 100 KG | 51612 | 2.5% |
| Unit | 38615 | 1.9% |
| Day | 17214 | 0.8% |
| Head | 14848 | 0.7% |
| 50 KG | 14036 | 0.7% |
| 90 KG | 13261 | 0.6% |
| 46 KG | 11793 | 0.6% |
| Pound | 11616 | 0.6% |
| Other values (115) | 215464 | 10.5% |
Words
| Value | Count | Frequency (%) |
| kg | 1686335 | 73.3% |
| l | 146332 | 6.4% |
| 100 | 57058 | 2.5% |
| g | 42697 | 1.9% |
| unit | 38615 | 1.7% |
| pcs | 19493 | 0.8% |
| day | 17214 | 0.7% |
| 50 | 15423 | 0.7% |
| head | 14848 | 0.6% |
| 90 | 13261 | 0.6% |
| Other values (98) | 249996 | 10.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 1735945 | 33.4% |
| K | 1686335 | 32.4% |
| (space) | 250634 | 4.8% |
| 0 | 243595 | 4.7% |
| L | 175423 | 3.4% |
| 1 | 130679 | 2.5% |
| 5 | 73943 | 1.4% |
| a | 65355 | 1.3% |
| n | 62350 | 1.2% |
| U | 58862 | 1.1% |
| Other values (37) | 716509 | 13.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3784150 | 72.8% |
| Decimal Number | 600107 | 11.5% |
| Lowercase Letter | 514227 | 9.9% |
| Space Separator | 250634 | 4.8% |
| Other Punctuation | 50512 | 1.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 65355 | 12.7% |
| n | 62350 | 12.1% |
| i | 54468 | 10.6% |
| t | 52922 | 10.3% |
| e | 37754 | 7.3% |
| c | 31798 | 6.2% |
| d | 29122 | 5.7% |
| o | 25645 | 5.0% |
| s | 25446 | 4.9% |
| u | 21848 | 4.2% |
| Other values (11) | 107519 | 20.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 1735945 | 45.9% |
| K | 1686335 | 44.6% |
| L | 175423 | 4.6% |
| U | 58862 | 1.6% |
| D | 28213 | 0.7% |
| M | 24424 | 0.6% |
| P | 19232 | 0.5% |
| H | 14882 | 0.4% |
| C | 14453 | 0.4% |
| S | 12774 | 0.3% |
| Other values (3) | 13607 | 0.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 243595 | 40.6% |
| 1 | 130679 | 21.8% |
| 5 | 73943 | 12.3% |
| 2 | 36896 | 6.1% |
| 3 | 30192 | 5.0% |
| 4 | 28892 | 4.8% |
| 9 | 21664 | 3.6% |
| 6 | 19491 | 3.2% |
| 8 | 7745 | 1.3% |
| 7 | 7010 | 1.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 40235 | 79.7% |
| / | 10277 | 20.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 250634 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4298377 | 82.7% |
| Common | 901253 | 17.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 1735945 | 40.4% |
| K | 1686335 | 39.2% |
| L | 175423 | 4.1% |
| a | 65355 | 1.5% |
| n | 62350 | 1.5% |
| U | 58862 | 1.4% |
| i | 54468 | 1.3% |
| t | 52922 | 1.2% |
| e | 37754 | 0.9% |
| c | 31798 | 0.7% |
| Other values (24) | 337165 | 7.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 250634 | 27.8% |
| 0 | 243595 | 27.0% |
| 1 | 130679 | 14.5% |
| 5 | 73943 | 8.2% |
| . | 40235 | 4.5% |
| 2 | 36896 | 4.1% |
| 3 | 30192 | 3.4% |
| 4 | 28892 | 3.2% |
| 9 | 21664 | 2.4% |
| 6 | 19491 | 2.2% |
| Other values (3) | 25032 | 2.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5199630 | 100.0% |
month recorded
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.363020679 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.403188688 |
|---|---|
| Coefficient of variation (CV) | 0.5348385397 |
| Kurtosis | -1.179686369 |
| Mean | 6.363020679 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.05451027146 |
| Sum | 13048252 |
| Variance | 11.58169325 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 182132 | 8.9% |
| 5 | 181862 | 8.9% |
| 3 | 179884 | 8.8% |
| 7 | 177613 | 8.7% |
| 4 | 176659 | 8.6% |
| 2 | 173404 | 8.5% |
| 1 | 172090 | 8.4% |
| 8 | 166226 | 8.1% |
| 10 | 165525 | 8.1% |
| 9 | 164928 | 8.0% |
| Other values (2) | 310315 | 15.1% |
| Value | Count | Frequency (%) |
| 1 | 172090 | 8.4% |
| 2 | 173404 | 8.5% |
| 3 | 179884 | 8.8% |
| 4 | 176659 | 8.6% |
| 5 | 181862 | 8.9% |
| 6 | 182132 | 8.9% |
| 7 | 177613 | 8.7% |
| 8 | 166226 | 8.1% |
| 9 | 164928 | 8.0% |
| 10 | 165525 | 8.1% |
| Value | Count | Frequency (%) |
| 12 | 154798 | 7.5% |
| 11 | 155517 | 7.6% |
| 10 | 165525 | 8.1% |
| 9 | 164928 | 8.0% |
| 8 | 166226 | 8.1% |
| 7 | 177613 | 8.7% |
| 6 | 182132 | 8.9% |
| 5 | 181862 | 8.9% |
| 4 | 176659 | 8.6% |
| 3 | 179884 | 8.8% |
year recorded
Real number (ℝ≥0)
| Distinct | 32 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2016.130844 |
| Minimum | 1990 |
| Maximum | 2021 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 1990 |
|---|---|
| 5-th percentile | 2008 |
| Q1 | 2014 |
| median | 2017 |
| Q3 | 2020 |
| 95-th percentile | 2021 |
| Maximum | 2021 |
| Range | 31 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.458825267 |
|---|---|
| Coefficient of variation (CV) | 0.002211575346 |
| Kurtosis | 2.220083238 |
| Mean | 2016.130844 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -1.344576832 |
| Sum | 4134354521 |
| Variance | 19.88112276 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2020 | 395781 | 19.3% |
| 2021 | 203084 | 9.9% |
| 2019 | 202032 | 9.9% |
| 2018 | 183970 | 9.0% |
| 2017 | 173234 | 8.4% |
| 2016 | 147333 | 7.2% |
| 2015 | 135859 | 6.6% |
| 2014 | 121016 | 5.9% |
| 2013 | 107881 | 5.3% |
| 2012 | 86311 | 4.2% |
| Other values (22) | 294137 | 14.3% |
| Value | Count | Frequency (%) |
| 1990 | 140 | < 0.1% |
| 1991 | 134 | < 0.1% |
| 1992 | 280 | < 0.1% |
| 1993 | 284 | < 0.1% |
| 1994 | 1537 | 0.1% |
| 1995 | 1287 | 0.1% |
| 1996 | 2022 | 0.1% |
| 1997 | 1638 | 0.1% |
| 1998 | 1792 | 0.1% |
| 1999 | 1713 | 0.1% |
| Value | Count | Frequency (%) |
| 2021 | 203084 | 9.9% |
| 2020 | 395781 | 19.3% |
| 2019 | 202032 | 9.9% |
| 2018 | 183970 | 9.0% |
| 2017 | 173234 | 8.4% |
| 2016 | 147333 | 7.2% |
| 2015 | 135859 | 6.6% |
| 2014 | 121016 | 5.9% |
| 2013 | 107881 | 5.3% |
| 2012 | 86311 | 4.2% |
price paid
Real number (ℝ≥0)
| Distinct | 239811 |
|---|---|
| Distinct (%) | 11.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6413.983952 |
| Minimum | 0 |
| Maximum | 21777780 |
| Zeros | 34 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 45 |
| median | 246.55585 |
| Q3 | 1200 |
| 95-th percentile | 22000 |
| Maximum | 21777780 |
| Range | 21777780 |
| Interquartile range (IQR) | 1155 |
Descriptive statistics
| Standard deviation | 106977.235 |
|---|---|
| Coefficient of variation (CV) | 16.67875002 |
| Kurtosis | 14317.0661 |
| Mean | 6413.983952 |
| Median Absolute Deviation (MAD) | 232.69585 |
| Skewness | 107.5510841 |
| Sum | 1.315275922 × 10^10 |
| Variance | 1.144412881 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 200 | 23571 | 1.1% |
| 500 | 22600 | 1.1% |
| 300 | 21174 | 1.0% |
| 400 | 18858 | 0.9% |
| 1000 | 18673 | 0.9% |
| 250 | 18153 | 0.9% |
| 100 | 15307 | 0.7% |
| 50 | 14465 | 0.7% |
| 350 | 13828 | 0.7% |
| 150 | 13606 | 0.7% |
| Other values (239801) | 1870403 | 91.2% |
| Value | Count | Frequency (%) |
| 0 | 34 | < 0.1% |
| 0.01 | 4 | < 0.1% |
| 0.0125 | 1 | < 0.1% |
| 0.02 | 1 | < 0.1% |
| 0.09 | 1 | < 0.1% |
| 0.1 | 28 | < 0.1% |
| 0.1001 | 2 | < 0.1% |
| 0.1005 | 1 | < 0.1% |
| 0.105 | 1 | < 0.1% |
| 0.1078 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 21777780 | 1 | < 0.1% |
| 19777777 | 1 | < 0.1% |
| 18666666 | 1 | < 0.1% |
| 17250000 | 1 | < 0.1% |
| 17200000 | 1 | < 0.1% |
| 17000000 | 10 | < 0.1% |
| 16250000 | 1 | < 0.1% |
| 16000000 | 1 | < 0.1% |
| 15000000 | 4 | < 0.1% |
| 14800000 | 1 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
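As a minimal sketch of the calculation described above (not the profiling tool's own implementation), the following standard-library Python ranks both variables, gives tied values the mean of their positions, and then divides the covariance of the ranks by the product of their standard deviations. The names `average_ranks`, `pearson`, and `spearman_rho` are illustrative assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The common 1/n factors cancel, so plain sums are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotone but nonlinear pair such as `x = [1, 2, 3, 4, 5]` and `y = [1, 8, 27, 64, 125]`, `spearman_rho(x, y)` evaluates to 1 up to floating-point rounding, because only the ordering of the values matters.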
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
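The formula and the invariance property can be illustrated with a short sketch in standard-library Python. The function name `pearson_r` and the sample values are assumptions made for illustration; they are not taken from the dataset profiled here.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative data: y is roughly 2 * x with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors);
# the coefficient should be unchanged up to floating-point rounding.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```

Here `r_original` and `r_rescaled` agree up to rounding. A negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.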
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
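The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in standard-library Python follows; the function name `kendall_tau_a` is an assumption, and the quadratic loop over all pairs is meant for illustration on small inputs only.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(zip(x, y), 2))
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither total
    return (concordant - discordant) / len(pairs)
```

For `x = [1, 2, 3, 4]` and `y = [1, 3, 2, 4]` there are six pairs, five concordant and one discordant, so τ is 4/6. Library implementations such as `scipy.stats.kendalltau` default to the tau-b variant, which adjusts the denominator for ties, so their results can differ from this sketch when ties are present.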
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | country id | country name | locality id | locality name | market id | market name | commodity purchase id | commodity purchased | currency id | name of currency | market type id | market name.1 | measurement id | unit of goods measurement | month recorded | year recorded | price paid | mp_commoditysource |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 1 | 2014 | 50.0 | NaN |
| 1 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 2 | 2014 | 50.0 | NaN |
| 2 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 3 | 2014 | 50.0 | NaN |
| 3 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 4 | 2014 | 50.0 | NaN |
| 4 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 5 | 2014 | 50.0 | NaN |
| 5 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 6 | 2014 | 50.0 | NaN |
| 6 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 7 | 2014 | 50.0 | NaN |
| 7 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 8 | 2014 | 50.0 | NaN |
| 8 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 9 | 2014 | 50.0 | NaN |
| 9 | 1.0 | Afghanistan | 272 | Badakhshan | 266 | Fayzabad | 55 | Bread - Retail | 0.0 | AFN | 15 | Retail | 5 | KG | 10 | 2014 | 50.0 | NaN |
Last rows
| | country id | country name | locality id | locality name | market id | market name | commodity purchase id | commodity purchased | currency id | name of currency | market type id | market name.1 | measurement id | unit of goods measurement | month recorded | year recorded | price paid | mp_commoditysource |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2050628 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 52 | Rice - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 110.6250 | NaN |
| 2050629 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 54 | Maize meal - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 50.0000 | NaN |
| 2050630 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 96 | Oil (vegetable) - Retail | 0.0 | ZWL | 15 | Retail | 15 | L | 6 | 2021 | 197.0000 | NaN |
| 2050631 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 97 | Sugar - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 118.3750 | NaN |
| 2050632 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 185 | Salt - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 71.0000 | NaN |
| 2050633 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 432 | Beans (sugar) - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 233.3333 | NaN |
| 2050634 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 539 | Toothpaste - Retail | 0.0 | ZWL | 15 | Retail | 116 | 100 ML | 6 | 2021 | 112.5000 | NaN |
| 2050635 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 540 | Laundry soap - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 114.0000 | NaN |
| 2050636 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 541 | Handwash soap - Retail | 0.0 | ZWL | 15 | Retail | 66 | 250 G | 6 | 2021 | 59.5000 | NaN |
| 2050637 | 271.0 | Zimbabwe | 3444 | Midlands | 5594 | Mbilashaba | 887 | Fish (kapenta) - Retail | 0.0 | ZWL | 15 | Retail | 5 | KG | 6 | 2021 | 1200.0000 | NaN |